Binary and graded relevance in IR evaluations--Comparison of the effects on ranking of IR systems

Author

  • Jaana Kekäläinen
Abstract

In this study, the rankings of IR systems based on binary and graded relevance in TREC 7 and 8 data are compared. The relevance of a sample of TREC results is reassessed using a relevance scale with four levels: non-relevant, marginally relevant, fairly relevant and highly relevant. Twenty-one topics and 90 systems from TREC 7, and 20 topics and 121 systems from TREC 8, form the data. The measures compared are binary precision, cumulated gain, discounted cumulated gain and normalised discounted cumulated gain. Different weighting schemes for relevance levels are tested with the cumulated gain measures. Kendall's rank correlations are computed to determine to what extent the rankings produced by different measures are similar. Weighting schemes, from binary to those emphasising highly relevant documents, form a continuum in which the measures correlate strongly at the binary end and less strongly at the heavily weighted end. The results show the different character of the measures. © 2005 Elsevier Ltd. All rights reserved.
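The measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly cited formulation of cumulated gain, in which the gain at rank i is divided by the logarithm of i once i reaches the logarithm base (base 2 here is an assumption). The two weighting schemes and all function names are hypothetical and chosen only for illustration; they are not taken from the paper. Kendall's tau is computed in its simple tau-a form, without tie correction.

```python
import math
from itertools import combinations

# Hypothetical weighting schemes: relevance level (0 = non-relevant,
# 1 = marginally, 2 = fairly, 3 = highly relevant) -> gain value.
BINARY = {0: 0, 1: 1, 2: 1, 3: 1}
HEAVY = {0: 0, 1: 1, 2: 10, 3: 100}


def cumulated_gain(levels, weights):
    """CG vector: running sum of weighted gains down the ranked list."""
    total, cg = 0.0, []
    for level in levels:
        total += weights[level]
        cg.append(total)
    return cg


def discounted_cumulated_gain(levels, weights, base=2):
    """DCG vector: gains at ranks >= base are divided by log_base(rank)."""
    total, dcg = 0.0, []
    for rank, level in enumerate(levels, start=1):
        gain = weights[level]
        total += gain if rank < base else gain / math.log(rank, base)
        dcg.append(total)
    return dcg


def ndcg_at(levels, judged_levels, weights, k, base=2):
    """nDCG at rank k: DCG divided by the DCG of an ideal ordering.

    judged_levels holds the relevance levels of all judged documents
    for the topic; sorting them descending gives the ideal ranking
    (this assumes gains grow with the relevance level).
    """
    ideal = sorted(judged_levels, reverse=True)[:k]
    ideal_dcg = discounted_cumulated_gain(ideal, weights, base)
    actual_dcg = discounted_cumulated_gain(levels[:k], weights, base)
    if not actual_dcg or not ideal_dcg or ideal_dcg[-1] == 0:
        return 0.0
    return actual_dcg[-1] / ideal_dcg[-1]


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score lists over the same systems."""
    n = len(scores_a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    # Toy data: relevance levels of the top 4 results of three systems.
    judged = [3, 2, 1, 1, 0, 0]
    runs = {"sysA": [1, 1, 2, 0], "sysB": [3, 0, 0, 0], "sysC": [2, 1, 0, 1]}
    binary = [ndcg_at(r, judged, BINARY, 4) for r in runs.values()]
    heavy = [ndcg_at(r, judged, HEAVY, 4) for r in runs.values()]
    print("tau between binary and heavy weighting:", kendall_tau(binary, heavy))
```

Under the binary scheme every relevant document counts the same, so a system returning many marginally relevant documents scores well; under the heavy scheme a single highly relevant document near the top dominates. Comparing the two system orderings with Kendall's tau is one way to see how far the rankings drift apart as the weighting moves away from binary.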


Similar resources

Survey of graded relevance metrics for information retrieval

A large number of metrics are available to evaluate the quality of ranked web pages in information retrieval (IR). These metrics can be classified into the following groups: Binary Relevance, Graded Relevance, Rank Correlation Coefficient, and User Oriented Measures. Each group of metrics has different characteristics. However, metrics in the same group have similar charac...


Logical Imaging and Probabilistic Information Retrieval

In Information Retrieval (IR), probabilistic modelling relates to the use of a retrieval model that ranks documents in decreasing order of their estimated probability of relevance to a user’s information need expressed by a query. In an IR system based on a probabilistic model, the user is always guided to examine first the documents which are the most likely to be relevant to his or her need. ...


Cumulated Gain-based Indicators of IR Performance

Modern large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques to this direction, it is necessary to develop evaluation approaches and methods that credit IR methods for their abi...


Evaluating ADM on a Four-Level Relevance Scale Document Set from NTCIR

The most common effectiveness measures for Information Retrieval (IR) systems are based on the assumptions of binary relevance (either a document is relevant to a given query or it is not) and binary retrieval (either a document is retrieved or it is not). These assumptions are often questioned, since almost everybody agrees that relevance and retrieval are a matter of degree (three or more categorie...


Toward Consistent Evaluation of Relevance Feedback Approaches in Multimedia Retrieval

Many different communities have conducted research on the efficacy of relevance feedback in multimedia information systems. Unlike text IR, performance evaluation of multimedia IR systems tends to conform to the accepted standards of the community within which the work is conducted. This leads to idiosyncratic performance evaluations and hampers the ability to compare different techniques fairl...



Journal title:
  • Inf. Process. Manage.

Volume 41

Publication date: 2005